
Three Mile Island nuclear plant makes comeback with $1B in federal backing to meet increasing energy demands

FOX News

Microsoft and Constellation Energy partner to restart a Three Mile Island nuclear reactor with a $1 billion federal loan to power artificial intelligence operations.


Unsupervised decoding of encoded reasoning using language model interpretability

Fang, Ching, Marks, Samuel

arXiv.org Artificial Intelligence

As large language models become increasingly capable, there is growing concern that they may develop reasoning processes that are encoded or hidden from human oversight. To investigate whether current interpretability techniques can penetrate such encoded reasoning, we construct a controlled testbed by fine-tuning a reasoning model (DeepSeek-R1-Distill-Llama-70B) to perform chain-of-thought reasoning in ROT-13 encryption while maintaining intelligible English outputs. We evaluate mechanistic interpretability methods--in particular, logit lens analysis--on their ability to decode the model's hidden reasoning process using only internal activations. We show that logit lens can effectively translate encoded reasoning, with accuracy peaking in intermediate-to-late layers. Finally, we develop a fully unsupervised decoding pipeline that combines logit lens with automated paraphrasing, achieving substantial accuracy in reconstructing complete reasoning transcripts from internal model representations. These findings suggest that current mechanistic interpretability techniques may be more robust to simple forms of encoded reasoning than previously understood. Our work provides an initial framework for evaluating interpretability methods against models that reason in non-human-readable formats, contributing to the broader challenge of maintaining oversight over increasingly capable AI systems.
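The core mechanic is easy to sketch: the logit lens projects an intermediate-layer activation through the unembedding matrix to read off a token, and a ROT-13 pass turns that token back into English. The toy below uses a made-up vocabulary and a random unit-norm unembedding matrix as stand-ins for the model's real ones; it illustrates the decoding pipeline only, not the paper's implementation.

```python
import codecs
import numpy as np

# Toy vocabulary mixing English words and their ROT-13 forms, plus a random
# unembedding matrix with unit-norm columns (stand-ins for the real model's).
vocab = ["the", "gur", "answer", "nafjre", "is", "vf"]
rng = np.random.default_rng(0)
d_model = 8
W_U = rng.normal(size=(d_model, len(vocab)))
W_U /= np.linalg.norm(W_U, axis=0)  # unit-norm columns: self-similarity is max

def logit_lens(hidden_state):
    """Project an intermediate-layer activation straight to vocab logits."""
    logits = hidden_state @ W_U
    return vocab[int(np.argmax(logits))]

# Suppose a mid-layer activation aligns with the ROT-13 token "gur" ...
h = W_U[:, vocab.index("gur")]
token = logit_lens(h)
# ... which a ROT-13 pass translates back into readable English:
decoded = codecs.decode(token, "rot13")
```

Because the columns are unit-norm, an activation equal to a column always decodes to that column's token, so the round trip is deterministic in this sketch.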


Machine learning-based cloud resource allocation algorithms: a comprehensive comparative review

Bodra, Deep, Khairnar, Sushil

arXiv.org Artificial Intelligence

Cloud resource allocation has emerged as a major challenge in modern computing environments, with organizations struggling to manage complex, dynamic workloads while optimizing performance and cost efficiency. Traditional heuristic approaches prove inadequate for handling the multi-objective optimization demands of existing cloud infrastructures. This paper presents a comparative analysis of state-of-the-art artificial intelligence and machine learning algorithms for resource allocation. We systematically evaluate 10 algorithms across four categories: Deep Reinforcement Learning approaches, Neural Network architectures, Traditional Machine Learning enhanced methods, and Multi-Agent systems. Analysis of published results demonstrates significant performance improvements across multiple metrics including makespan reduction, cost optimization, and energy efficiency gains compared to traditional methods. The findings reveal that hybrid architectures combining multiple artificial intelligence and machine learning techniques consistently outperform single-method approaches, with edge computing environments showing the highest deployment readiness. Our analysis provides critical insights for both academic researchers and industry practitioners seeking to implement next-generation cloud resource allocation strategies in increasingly complex and dynamic computing environments.
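As a concrete anchor for one of the metrics above, the toy below computes makespan (the finish time of the busiest machine) for a naive round-robin schedule versus the classic longest-processing-time heuristic that learned allocators are typically benchmarked against. Task durations and machine counts are invented for illustration.

```python
# Makespan: the finish time of the busiest machine under a given assignment.
def makespan(assignment, n_machines):
    loads = [0.0] * n_machines
    for machine, duration in assignment:
        loads[machine] += duration
    return max(loads)

def round_robin(tasks, n_machines):
    """Naive baseline: deal tasks out cyclically, ignoring durations."""
    return [(i % n_machines, t) for i, t in enumerate(tasks)]

def greedy_lpt(tasks, n_machines):
    """Longest-processing-time-first: a classic heuristic baseline."""
    loads = [0.0] * n_machines
    out = []
    for t in sorted(tasks, reverse=True):
        m = loads.index(min(loads))  # place on the least-loaded machine
        out.append((m, t))
        loads[m] += t
    return out

tasks = [7, 5, 4, 3, 2, 1]  # made-up task durations
rr = makespan(round_robin(tasks, 2), 2)
lpt = makespan(greedy_lpt(tasks, 2), 2)
```

On this instance round-robin finishes at 13 while LPT balances both machines at 11, the kind of gap the reviewed ML methods aim to widen further.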


That New Hit Song on Spotify? It Was Made by A.I.

The New Yorker

That New Hit Song on Spotify? Aspiring musicians are churning out tracks using generative artificial intelligence. Some are topping the charts. Nick Arter, a thirty-five-year-old in Washington, D.C., never quite managed to become a professional musician the old-fashioned way. He grew up in Harrisburg, Pennsylvania, in a music-loving family.


Explaining Fine-Tuned LLMs via Counterfactuals: A Knowledge Graph-Driven Framework

Wang, Yucheng, Chen, Ziyang, Kabir, Md Faisal

arXiv.org Artificial Intelligence

The widespread adoption of Low-Rank Adaptation (LoRA) has enabled large language models (LLMs) to acquire domain-specific knowledge with remarkable efficiency. However, understanding how such a fine-tuning mechanism alters a model's structural reasoning and semantic behavior remains an open challenge. This work introduces a novel framework that explains fine-tuned LLMs via counterfactuals grounded in knowledge graphs. Specifically, we construct BioToolKG, a domain-specific heterogeneous knowledge graph of bioinformatics tools, and design a counterfactual-based fine-tuned LLM explainer (CFFTLLMExplainer) that learns soft masks over graph nodes and edges to generate minimal structural perturbations that induce maximal semantic divergence. Our method jointly optimizes structural sparsity and semantic divergence while enforcing interpretability-preserving constraints such as entropy regularization and edge smoothness. We apply this framework to a fine-tuned LLaMA-based LLM and reveal that counterfactual masking exposes the model's structural dependencies and aligns with LoRA-induced parameter shifts. This work provides new insights into the internal mechanisms of fine-tuned LLMs and highlights counterfactual graphs as a potential tool for interpretable AI.
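The objective described above can be sketched in a few lines: a sigmoid turns learnable logits into soft masks, an L1 term rewards sparsity, and a binary-entropy term pushes masks toward hard 0/1 values. Everything here is a toy stand-in for the paper's CFFTLLMExplainer, with a caller-supplied divergence function in place of the real semantic-divergence measure.

```python
import numpy as np

def mask_objective(logits, divergence, lambda_sparse=0.1, lambda_ent=0.01):
    """Score soft masks over graph edges: reward semantic divergence,
    penalize mask mass (L1 sparsity) and non-binary masks (entropy)."""
    m = 1.0 / (1.0 + np.exp(-logits))  # soft mask in (0, 1)
    sparsity = np.abs(m).sum()          # prefer few perturbed edges
    eps = 1e-8
    entropy = -(m * np.log(m + eps) + (1 - m) * np.log(1 - m + eps)).sum()
    # maximize divergence; entropy penalty drives masks toward 0/1
    return divergence(m) - lambda_sparse * sparsity - lambda_ent * entropy
```

In the paper's setting the divergence would compare the fine-tuned model's outputs before and after masking; here any callable over the mask works, which makes the trade-off between the three terms easy to probe.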


E-CaTCH: Event-Centric Cross-Modal Attention with Temporal Consistency and Class-Imbalance Handling for Misinformation Detection

Mousavi, Ahmad, Abdollahinejad, Yeganeh, Corizzo, Roberto, Japkowicz, Nathalie, Boukouvalas, Zois

arXiv.org Artificial Intelligence

Detecting multimodal misinformation on social media remains challenging due to inconsistencies between modalities, changes in temporal patterns, and substantial class imbalance. Many existing methods treat posts independently and fail to capture the event-level structure that connects them across time and modality. We propose E-CaTCH, an interpretable and scalable framework for robustly detecting misinformation. If needed, E-CaTCH clusters posts into pseudo-events based on textual similarity and temporal proximity, then processes each event independently. Within each event, textual and visual features are extracted using pre-trained BERT and ResNet encoders, refined via intra-modal self-attention, and aligned through bidirectional cross-modal attention. A soft gating mechanism fuses these representations to form contextualized, content-aware embeddings of each post. To model temporal evolution, E-CaTCH segments events into overlapping time windows and uses a trend-aware LSTM, enhanced with semantic shift and momentum signals, to encode narrative progression over time. Classification is performed at the event level, enabling better alignment with real-world misinformation dynamics. To address class imbalance and promote stable learning, the model integrates adaptive class weighting, temporal consistency regularization, and hard-example mining. The total loss is aggregated across all events. Extensive experiments on Fakeddit, IND, and COVID-19 MISINFOGRAPH demonstrate that E-CaTCH consistently outperforms state-of-the-art baselines. Cross-dataset evaluations further demonstrate its robustness, generalizability, and practical applicability across diverse misinformation scenarios.
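The soft-gating fusion step lends itself to a minimal sketch: a learned gate, computed from the concatenated text and image embeddings, interpolates per dimension between the two modalities. The weights below are random stand-ins, not E-CaTCH's trained parameters.

```python
import numpy as np

def soft_gate_fuse(text_emb, img_emb, W_g, b_g):
    """Per-dimension gate in (0, 1) blends the two modality embeddings."""
    pre = np.concatenate([text_emb, img_emb]) @ W_g + b_g
    gate = 1.0 / (1.0 + np.exp(-pre))
    return gate * text_emb + (1.0 - gate) * img_emb

rng = np.random.default_rng(0)
d = 4
text_emb, img_emb = rng.normal(size=d), rng.normal(size=d)
W_g, b_g = rng.normal(size=(2 * d, d)), np.zeros(d)  # random stand-in weights
fused = soft_gate_fuse(text_emb, img_emb, W_g, b_g)
```

Because the gate stays strictly between 0 and 1, each fused coordinate is a convex combination of the corresponding text and image coordinates, so neither modality is ever discarded outright.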


Enhancing Clinical Text Classification via Fine-Tuned DRAGON Longformer Models

Yang, Mingchuan, Huang, Ziyuan

arXiv.org Artificial Intelligence

This study explores the optimization of the DRAGON Longformer base model for clinical text classification, specifically targeting the binary classification of medical case descriptions. A dataset of 500 clinical cases containing structured medical observations was used, with 400 cases for training and 100 for validation. Enhancements to the pre-trained joeranbosma/dragon-longformer-base-mixed-domain model included hyperparameter tuning, domain-specific preprocessing, and architectural adjustments. Key modifications involved increasing sequence length from 512 to 1024 tokens, lowering the learning rate from 1e-05 to 5e-06, extending training from 5 to 8 epochs, and incorporating specialized medical terminology. The optimized model achieved notable performance gains: accuracy improved from 72.0% to 85.2%, precision from 68.0% to 84.1%, recall from 75.0% to 86.3%, and F1-score from 71.0% to 85.2%. Statistical analysis confirmed the significance of these improvements (p < .001). The model demonstrated enhanced capability in interpreting medical terminology, anatomical measurements, and clinical observations. These findings contribute to domain-specific language model research and offer practical implications for clinical natural language processing applications. The optimized model's strong performance across diverse medical conditions underscores its potential for broad use in healthcare settings.

Introduction

Natural language processing (NLP) in healthcare has continued to advance rapidly, revolutionizing the ability to analyze clinical texts and automate the extraction of valuable insights from massive amounts of medical documentation (Khurana, Koli, Khatter, & Singh, 2023). Over the past few years, large language models (LLMs) have emerged as powerful tools for processing and gaining insight from clinical narratives, enabling capabilities never before seen in medical text classification, entity recognition, and clinical decision support (Wang et al., 2018). Among these models, the DRAGON (Deep Representation Analysis for General-domain Ontology Networks) framework is specialized for medical text processing (Bosma et al., 2025). As Beltagy, Peters, and Cohan (2020) describe, the Longformer architecture on which the DRAGON Longformer model is built addresses the quadratic computational complexity of traditional transformers, allowing long sequences to be processed efficiently.
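The reported hyperparameter changes are concrete enough to collect in one place. The dicts below simply restate them; the field names are illustrative, not taken from the study's code.

```python
# Baseline vs. optimized fine-tuning settings as reported in the study.
# Field names are illustrative stand-ins, not the authors' actual config keys.
baseline = {
    "model": "joeranbosma/dragon-longformer-base-mixed-domain",
    "max_seq_length": 512,
    "learning_rate": 1e-5,
    "epochs": 5,
}
optimized = dict(baseline, max_seq_length=1024, learning_rate=5e-6, epochs=8)

changed = {k for k in baseline if baseline[k] != optimized[k]}
```

Doubling the sequence length while halving the learning rate and adding epochs is a common pattern when moving a Longformer-style model onto longer clinical documents, though the study's preprocessing and vocabulary additions matter as well.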


Advancing 3D Medical Image Segmentation: Unleashing the Potential of Planarian Neural Networks in Artificial Intelligence

Huang, Ziyuan, Huggins, Kevin, Bellur, Srikar

arXiv.org Artificial Intelligence

Author Note: Correspondence concerning this article should be addressed to Ziyuan Huang, University of Massachusetts Chan Medical School, 368 Plantation Street, Worcester, MA 01605.

Our study presents PNN-UNet as a method for constructing deep neural networks that replicate the planarian neural network (PNN) structure in the context of 3D medical image data. Planarians typically have a cerebral structure comprising two neural cords, where the cerebrum acts as a coordinator, and the neural cords serve slightly different purposes within the organism's neurological system. Accordingly, PNN-UNet comprises a Deep-UNet and a Wide-UNet as the nerve cords, with a densely connected autoencoder performing the role of the brain. This distinct architecture offers advantages over both monolithic (UNet) and modular networks (Ensemble-UNet). Our outcomes on a 3D MRI hippocampus dataset, with and without data augmentation, demonstrate that PNN-UNet outperforms the baseline UNet and several other UNet variants in image segmentation.

Introduction

Medical image segmentation using deep learning techniques plays an increasingly crucial role in assisting clinical diagnosis. Every day, hospitals capture exponentially more medical images, making it increasingly difficult to process big data efficiently and effectively. Medical image segmentation can be classified into three major categories: 2D, 2.5D, and 3D (Minaee et al., 2021; Zhang et al., 2022). The 2D method segments 3D images slice-by-slice, using 2D slices as training and testing data. In the 2.5D category, segmentation algorithms also proceed slice-by-slice but add neighboring slices as additional inputs. Lastly, in the 3D category, images are cropped into small cubic volumes for training and testing.
It is important to note that different methods have their advantages and disadvantages in 3D medical image segmentation.
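The three-part layout described above (two nerve-cord branches plus a coordinating "brain") can be sketched structurally. The toy functions below stand in for the Deep-UNet, Wide-UNet, and autoencoder respectively, showing only how their outputs compose, not any real segmentation behavior.

```python
import numpy as np

def deep_branch(x):
    """Stand-in for the Deep-UNet nerve cord."""
    return np.tanh(x) * 0.5

def wide_branch(x):
    """Stand-in for the Wide-UNet nerve cord."""
    return np.tanh(2 * x) * 0.5

def coordinator(a, b):
    """Stand-in for the densely connected autoencoder ("brain") that
    reconciles the two cords' outputs into one segmentation."""
    return (a + b) / 2.0

x = np.linspace(-1, 1, 5)           # stand-in for a 3D image volume
seg = coordinator(deep_branch(x), wide_branch(x))
```

The point of the sketch is the topology: unlike a monolithic UNet or an independent ensemble, the two branches feed a third module that is itself trained, so coordination is learned rather than fixed by simple averaging as in this toy.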


ADAM-1: AI and Bioinformatics for Alzheimer's Detection and Microbiome-Clinical Data Integrations

Huang, Ziyuan, Sekhon, Vishaldeep Kaur, Guo, Ouyang, Newman, Mark, Sadeghian, Roozbeh, Vaida, Maria L., Jo, Cynthia, Ward, Doyle, Bucci, Vanni, Haran, John P.

arXiv.org Artificial Intelligence

The Alzheimer's Disease Analysis Model Generation 1 (ADAM-1) is a multi-agent large language model (LLM) framework designed to integrate and analyze multi-modal data, including microbiome profiles, clinical datasets, and external knowledge bases, to enhance the understanding and detection of Alzheimer's disease (AD). By leveraging retrieval-augmented generation (RAG) techniques along with its multi-agent architecture, ADAM-1 synthesizes insights from diverse data sources and contextualizes findings using literature-driven evidence. Comparative evaluation against XGBoost revealed similar mean F1 scores but significantly reduced variance for ADAM-1, highlighting its robustness and consistency, particularly in small laboratory datasets. While currently tailored for binary classification tasks, future iterations aim to incorporate additional data modalities, such as neuroimaging and biomarkers, to broaden the scalability and applicability for Alzheimer's research and diagnostics.
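The robustness claim above rests on variance rather than mean F1. The fold-level scores below are made up for illustration; they show how two models can share a mean while differing sharply in spread, which is exactly the pattern the comparison reports.

```python
import statistics

# Made-up fold-level F1 scores: equal means, very different spread.
adam_f1    = [0.71, 0.72, 0.70, 0.71, 0.71]   # consistent across folds
xgboost_f1 = [0.60, 0.85, 0.55, 0.88, 0.67]   # same mean, high variance

mean_a, var_a = statistics.mean(adam_f1), statistics.pvariance(adam_f1)
mean_x, var_x = statistics.mean(xgboost_f1), statistics.pvariance(xgboost_f1)
```

On small laboratory datasets, where a single unlucky split can swing a fold's score, the low-variance profile is the one a clinician can trust run to run, even when the headline means tie.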


Parameter-Inverted Image Pyramid Networks for Visual Perception and Multimodal Understanding

Wang, Zhaokai, Zhu, Xizhou, Yang, Xue, Luo, Gen, Li, Hao, Tian, Changyao, Dou, Wenhan, Ge, Junqi, Lu, Lewei, Qiao, Yu, Dai, Jifeng

arXiv.org Artificial Intelligence

Image pyramids are widely adopted in top-performing methods to obtain multi-scale features for precise visual perception and understanding. However, current image pyramids use the same large-scale model to process multiple resolutions of images, leading to significant computational cost. To address this challenge, we propose a novel network architecture, called Parameter-Inverted Image Pyramid Networks (PIIP). Specifically, PIIP uses pretrained models (ViTs or CNNs) as branches to process multi-scale images, where images of higher resolutions are processed by smaller network branches to balance computational cost and performance. To integrate information from different spatial scales, we further propose a novel cross-branch feature interaction mechanism. To validate PIIP, we apply it to various perception models and a representative multimodal large language model called LLaVA, and conduct extensive experiments on various tasks such as object detection, segmentation, image classification and multimodal understanding. PIIP achieves superior performance compared to single-branch and existing multi-resolution approaches with lower computational cost. When applied to InternViT-6B, a large-scale vision foundation model, PIIP can improve its performance by 1%-2% on detection and segmentation with only 40%-60% of the original computation, finally achieving 60.0 box AP on MS COCO and 59.7 mIoU on ADE20K. For multimodal understanding, our PIIP-LLaVA achieves 73.0% accuracy on TextVQA and 74.5% on MMBench with only 2.8M training data. Our code is released at https://github.com/OpenGVLab/PIIP.
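The parameter-inverted pairing can be illustrated with a back-of-envelope cost model: compute scales roughly with tokens times parameters, and token count grows quadratically with resolution. The resolutions, ViT-style parameter counts, and the cost proxy below are illustrative assumptions, not the paper's numbers.

```python
def tokens(res, patch=16):
    """ViT-style token count: quadratic in image resolution."""
    return (res // patch) ** 2

# (resolution, branch parameter count in millions)
inverted = [(896, 22), (448, 86), (224, 304)]    # big image -> small branch
uniform  = [(896, 304), (448, 304), (224, 304)]  # same large model everywhere

def cost(pairs):
    """Crude compute proxy: tokens x parameters, summed over branches."""
    return sum(tokens(r) * p for r, p in pairs)

savings = 1 - cost(inverted) / cost(uniform)
```

Under this toy model the inverted pairing spends roughly equal compute on each branch and cuts total cost by a large factor versus running the biggest model at every resolution, which is the intuition behind PIIP's reported 40%-60% compute figure.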